† Corresponding author. E-mail:
Project supported by the National Natural Science Foundation of China (Grant Nos. 11805128, 11875042, 11505114, and 10975099), the Program for Professor of Special Appointment (Orientational Scholar) at Shanghai Institutions of Higher Learning, China (Grant Nos. D-USST02 and QD2015016), and the Shanghai Project for Construction of Top Disciplines, China (Grant No. USST-SYS-01).
A complex system contains generally many elements that are networked by their couplings. The time series of output records of the system’s dynamical process is subsequently a cooperative result of the couplings. Discovering the coupling structure stored in the time series is an essential task in time series analysis. However, in the currently used methods for time series analysis the structural information is merged completely by the procedure of statistical average. We propose a concept called mode network to preserve the structural information. Firstly, a time series is decomposed into intrinsic mode functions and residue by means of the empirical mode decomposition solution. The mode functions are employed to represent the contributions from different elements of the system. Each mode function is regarded as a mono-variate time series. All the mode functions form a multivariate time series. Secondly, the co-occurrences between all the mode functions are then used to construct a threshold network (mode network) to display the coupling structure. This method is illustrated by investigating gait time series. It is found that a walk trial can be separated into three stages. In the beginning stage, the residue component dominates the series, which is replaced by the mode function numbered M14 with peaks covering ∼680 strides (∼12 min) in the second stage. In the final stage more and more mode functions join into the backbone. The changes of coupling structure are mainly induced by the co-occurrent strengths of the mode functions numbered as M11, M12, M13, and M14, with peaks covering 200–700 strides. Hence, the mode network can display the rich and dynamical patterns of the coupling structure. This approach can be extended to investigate other complex systems such as the oil price and the stock market price series.
Records of dynamical process of a complex system form a sequential series. Time series analysis tries to discover non-trivial information from the series. It can not only shed light on the underlying dynamical mechanism, but also provide information for prediction, regulation and even control.[1] A time series is generally a result of couplings between many elements. For instance, a person’s walk is realized by the cooperation of the heart, lung, neural system and brain, etc.[2] The gait time series contains subsequently many components that come from the organs and their interactions. Each organ has its own dynamical behaviors such as the characteristic frequency and the specific mechanism in coupling with the other organs. The cross correlation between the components has generally a complicated and dynamical pattern. However, in the standard tools in literature, the statistical properties of moments and short/long-term persistence are obtained with the statistical average. This procedure of average dismissed completely the rich patterns.
In this paper, a new concept is proposed to preserve the patterns of cross correlation in time series. The method is illustrated with a case study of gait time series. Firstly, the empirical mode decomposition (EMD)[3] is adopted to obtain the modes (intrinsic mode functions (IMFs) and residue) in gait time series. The modes form a multivariate time series. Each mode is here represented with a variable. Secondly, the multivariate series is cut into non-overlapping segments with a specific length. For each segment, the overlapping degree between each pair of modes is calculated to measure the co-occurrent relationship between them. The co-occurrences between all the pairs of modes form a relationship matrix, which is employed to describe the state of the volunteer in the corresponding time duration. Thirdly, a threshold is then introduced to filter out the weak relationships, which results into a complex network.[4,5] All the networks form a series of networks (temporal network) that present us the kinetic trajectory of the volunteer’s walk.
It is found that with the increase of time, the network structure formed by the patterns of co-occurrence of modes becomes more and more complicated. According to the fluctuation of number of edges, the walk trial is separated into three stages. The core of the mode network (backbone formed by hubs and strong linkages) changes in the number of hubs and the position on the network.
The recordings for the trails of long-term gait dynamics are downloaded from the public accessible free-database of PhysioBank.[6] A total of ten healthy subjects take part in the experiment. The subjects do not have any history of neuromuscular, respiratory or cardiovascular disease and are not taking any medication. The average age, height and weight are 21.7 years (range: 18–29 years), 1.77± 0.08 m and 71.8± 10.7 kg (± standard deviation), respectively. An informed written consent is attached for each subject.
Each volunteer walks three times at normal, slow and fast paces, respectively. Each walk continues about one hour. The path is a long (225 m or 400 m) and approximately elliptical loop on a flat, unobstructed area. An ultra-thin and force-sensitive switch is affixed in a shoe to record the stride interval. The experiment provides subsequently a total of 30 recordings of trials. In this study, the recordings for seven volunteers (totally 21 trials) are considered, their identification numbers are 1, 2, 3, 6, 7, 8 and 9 in the database. The other trials are not considered because the EMD method fails in obtaining the relevant modes from these trails.
Huang et al.[3] proposed initially the empirical mode decomposition (EMD) to separate a signal into intrinsic mode functions and residue. An intrinsic mode function (IMF) should be a series that meets two criterions. The one is that the number of extreme values and the number of zero crossings are equal or at most one different. The other is that at any time, the average between the upper envelope formed by the local maxima and the lower envelope formed by the local minima is zero. To be as self-contained as possible, the relevant algorithm is described briefly. Let us consider a gait time series denoted with X(t), t = 1,2,…, T.
Step 1 One identifies all the maxima and minima and links them, which results in the upper envelop Xmax (t) and the lower envelop Xmin (t), respectively. The difference signal h(t) is calculated by subtracting the mean of the two envelopes from the original gait signal, which reads
Step 2 The above step is repeated until the resulting series of h(t) meets the definition of IMF. At the same time, the IMF should retain sufficient physical amplitude and frequency adjustment. This requirement is realized by limiting the value of the normalized squared difference
Step 3 Subtracting from the original gait time series the first IMF, conduction of the first two steps on the resulting series produces the second IMF, denoted with M2(t).
Step 4 Subtracting from the initial gait time series X(t) the obtained IMFs (denoted with M1(t), M2(t), …, Mm(t)), conduction of the first two steps on the resulting time series produces the (m + 1)th component, denoted with Mm+1(t).
The original gait time series can be reconstructed by the following formula,
Herein a new concept called mode network is proposed to illustrate the co-occurrence of the intrinsic mode functions and the remaining term, and its evolutionary behavior.[7,8] For simplicity, the remaining term R(t) (residue) is treated as the ML + 1(t). The formula of original gait time series X(t) is altered to
The co-occurrence between each pair of the IMFs and residue in the jth segment is measured by the overlapping degree,
To expose the backbones of the mode networks, one usually tries to filter out the weak (low-confident) linkages. Several ideas in literature are stimulating.[9,10] Sometimes, there exists a wide specific range of coupling strength, the networks constructed with thresholds selected exhibit qualitatively similar behaviors.[11] Every node is linked with a certain number of its closest neighbors.[12] Or one tries to embed the network in a two-dimensional plane, at the same time to preserve the linkages as possible.[13] In the present study, there appears a crossover in the distribution function of co-occurrence strengthes. The threshold θ is selected to be the strength at the cross-over point, i.e., the linkages whose weights are larger than the critical point θ are preserved and the others filtered out.
The selection of the segment length w should meet several requirements. Firstly, it should be long enough to make sure that the calculated co-occurrence strength has an acceptable sharp confidence interval. It should be short enough to make sure that in the covered time duration the physiological state keeps unchanged, so that the mode network can represent the state. Obviously, it also depends on the time scale of the interested properties of the walk progress. In the present work, the segment width w is chosen to be 112 strides. The time covered by these stride is about 2 min, in which the physiological state generally does not change significantly for a health volunteer. Mathematically, the sample is enough to guarantee a sharp confidence interval of co-occurrent strength (estimated with the definition in this study), as mentioned in the literature.[14] The reason why the specific value of 112 is selected rather than the other values of ∼ 100 is that this length can separate the series with only a few records lost.
Using the trials mentioned in the subsection
The top panel numbered “Raw” in Fig.
The normalized gait series is then converted to a multivariate series with 17 variables, and divided further into a total of 30 segments with length of w = 112 each (see the above explanation). The mode network series reads
Figure
Herein, the heat map is employed to display the evolutionary behavior of the backbone structure. To expose the evolutionary behavior, the linkages should be arranged in a specific order, in which the evolutionary behaviors of a pair of linkages are much more similar when they are positioned more closely. After filtering out the weak co-occurrent strengths with the threshold θ, a total of 70 linkages is preserved. The criterion for a linkage preserved is that it occurs at least in one of the mode networks Cj, j = 1,2,…,30. Firstly, for each linkage one collects the weights in the 30 mode networks. The resulting weight series describes the linkage’s evolution. Secondly, the correlation coefficients between the linkage weight series are calculated. Thirdly, the hierarchical clustering method is used to construct a clustering tree of the linkages (shown in Fig.
The summation of linkage weights that are larger than θ in the mode network is used to measure the global coupling strength, as shown in Fig.
The first stage is the duration from the 3rd to the 12th network (which covers about 1/3 of the total stride time). In this stage, there are a few numbers of edges. That is, only a few of modes dominate the state of the volunteer. Accompanied with Fig.
The second stage starts from the 13th and ends at the 23rd network (which covers about 1/3 of the total stride time). In this stage co-occurrence strength becomes strong and fluctuates with large amplitudes. The 1st, 13th, 14th and 16th modes appear in almost all the mode networks, while the 2nd, 4th, 9th, 11th, 12th and 15th modes just take part in several mode networks.
The last stage covers the mode networks from the 23rd to the end one (which covers about 1/3 of the total stride time). The number of edges increases rapidly, and more and more modes become important in the backbone.
The average for every linkage over the 30 mode networks and its error bar are displayed in Fig.
To display the change of the set of linkages and modes that form the backbone, two statistic properties of each mode are calculated, namely, the degree (amount of its connected neighbors) and the average of the shortest paths from it to all the other modes.[21,22] A node can be selected to be the kernel if it has the largest degree and the smallest average shortest paths. A total of 25 ego networks centered at the kernels are constructed, as shown in Fig.
Summarily, the backbone of mode network structure is significantly different in the three stages, and displays the complicated evolutionary behavior of the physiological state in the walking. However, this rich information on physiological state and its evolution is lost in the the detections of fractal, chaotic, and short-term persist behaviors.
A complex system consists of many elements, between which there exist complicated couplings. The time series of output records for a specific element is actually a cooperative result of all the elements’ dynamical processes. The contributions to the time series from the other elements have different behaviors that are determined by the coupling strengths and patterns and the working mechanisms of the elements. Though there exist many elements, in reality the number of elements that can be monitored is limited (only one or several signals can be measured). Hence, how to detect the coupling pattern from a mono-variate time series is an essential task in time series analysis. It can tell us how the different elements cooperate with each other, and what happens in a specific time duration in the dynamical process. However, the information on this coupling structure is lost completely in the currently used time series analysis due to the procedure of statistical average.
In the present study a new concept called mode network is designed to preserve the coupling structure in a mono-variate time series. It is realized by two successive steps. The first is to decompose the time series into many intrinsic mode functions by means of the empirical mode decomposition. Each of the intrinsic mode functions is a mono-variate time series used to represent the contribution from a specific element. All the mode functions form a multivariate time series. The second is to calculate the co-occurrent strengths between the mode functions in time durations with a specific length. A threshold is then determined to map the co-occurrent matrices into a temporal network, in which each mode function is a node and the co-occurrent strength between a pair of mode functions is the weight of linkage. The backbone that consists of the strong linkages and the corresponding mode functions are used to characterize the coupling structure and its evolution.
This approach is illustrated by a typical example of gait time series. In a walk trial, the physiological state of a volunteer evolves in a complicated way. For instance, in the case shown in the result section, the walk can be separated into three stages. The first stage is dominated by the residue, which can be regarded as a constant in the interested time scale (minute), which is taken over in the middle stage by the mode function with peaks covering ∼10 min. In the final stage the physiological state becomes more and complicated, i.e., more and more mode functions participate in the formation of the backbone. The traditional methods such as the de-trended fluctuation analysis, the wavelet transformation maximum module and the diffusion entropy approach gave controversial conclusions.[23,24] The key assumption in these methods is that the physiological state does not change and obey accordingly an identical law, which is not met as found in this study. It is also found that the change of physiological state is induced mainly by the co-occurrent strengths between the mode functions numbered M11, M12, M13, and M14, whose peaks cover a wide range from 200 to 700 (about 3–12 min).
A natural question is how to understand these findings from the viewpoint of physiology, i.e., what can the findings tell us on the physiological state of the volunteer? It is widely accepted that the IMFs reflect the contributions from different elements of a complex system. This idea is usually illustrated in textbooks with the series synthesized with periodic signals at different frequencies (components). However, the relation between the modes and the components used in the synthetic procedure becomes complicated and non-trivial, when the components are not periodic ones. To our knowledge, there is not report in literature that gives a clear relation between the contributions of organs and the intrinsic mode functions in a walk trial. It is still an open problem.
Some interesting works in literature on the same walk trials show that, in about one hour walk the dynamical mechanism of the motor nervous system (measured by the scale-invariant exponent of time series) can be separated into three stages with significantly different values of the Hurst exponent.[25–28] This finding is consistent with our findings in this study. Hence, it is reasonable to say that the evolutionary behavior of mode network is produced by the dynamical mechanism of walk.
In literature, the empirical decomposition is widely used to classify the gait patterns, e.g., to distinguish patients with Parkinson’s disease from healthy controls.[29] The key idea is to use the IMFs to construct a gait feature vector to represent the behavior of the corresponding individual. Then all kinds of classifiers (e.g., the neural network) are employed to separate the feature vectors into different groups. The mode network in this study provides a matrix of relationships between the IMFs modes, rather than a simple list of IMFs in the feature vector. What is more, our findings show that even for an identical healthy volunteer its physiological state will change significantly along time in a walk trial. Our investigation may find its applications in wearable devices to monitor healthy state in a real-time way.
In literature, there exist some standard decomposition methods to decompose time series into different components, such as the Fourier transformation and the wavelet transformation. The empirical mode has also been modified to improve its performance.[30] In the present work the original version of the empirical mode decomposition is employed to illustrate the proposed framework. It is worth testing the performance of the other decomposition methods in future works.
An interesting topic developed in recent years is the network based time series analysis.[31] The basic idea is to represent the patterns in time series with network and compare the constructed networks to find differences between time series[4,5,32–36] or to monitor the evolutionary behavior of a complex system.[37–39] We hope our work can stimulate detailed and systematic works on preserving patterns of cross correlation in records of dynamical processes of complex systems, such as the oil and stock market price series.
[1] | |
[2] | |
[3] | |
[4] | |
[5] | |
[6] | |
[7] | |
[8] | |
[9] | |
[10] | |
[11] | |
[12] | |
[13] | |
[14] | |
[15] | |
[16] | |
[17] | |
[18] | |
[19] | |
[20] | |
[21] | |
[22] | |
[23] | |
[24] | |
[25] | |
[26] | |
[27] | |
[28] | |
[29] | |
[30] | |
[31] | |
[32] | |
[33] | |
[34] | |
[35] | |
[36] | |
[37] | |
[38] | |
[39] |